36 research outputs found

    Algorithms for Hierarchical Clustering: An Overview, II

    We survey agglomerative hierarchical clustering algorithms and discuss efficient implementations that are available in R and other software environments. We look at hierarchical self-organizing maps and mixture models. We review grid-based clustering, focusing on hierarchical density-based approaches. Finally, we describe a recently developed, very efficient (linear-time) hierarchical clustering algorithm, which can also be viewed as a hierarchical grid-based algorithm. This review adds to the earlier version, Murtagh and Contreras (2012).

    Fast, Linear Time Hierarchical Clustering using the Baire Metric

    The Baire metric induces an ultrametric on a dataset and is of linear computational complexity, contrasted with the standard quadratic time agglomerative hierarchical clustering algorithm. In this work we evaluate empirically this new approach to hierarchical clustering. We compare hierarchical clustering based on the Baire metric with (i) agglomerative hierarchical clustering, in terms of algorithm properties; (ii) generalized ultrametrics, in terms of definition; and (iii) fast clustering through k-means partitioning, in terms of quality of results. For the latter, we carry out an in-depth astronomical study. We apply the Baire distance to spectrometric and photometric redshifts from the Sloan Digital Sky Survey using, in this work, about half a million astronomical objects. We want to know how well the (more costly to determine) spectrometric redshifts can predict the (more easily obtained) photometric redshifts, i.e. we seek to regress the spectrometric on the photometric redshifts, and we use clusterwise regression for this. Comment: 27 pages, 6 tables, 10 figures.
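The idea behind the Baire distance can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each value is already given as a fixed-length string of digits (for example, the decimal digits of a redshift after the decimal point), that the distance is the base raised to minus the length of the longest common prefix, and that base 2 is used as a convention. The function names `common_prefix_length`, `baire_distance` and `baire_hierarchy` are hypothetical.

```python
from collections import defaultdict


def common_prefix_length(a: str, b: str) -> int:
    """Number of leading characters that a and b share."""
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n


def baire_distance(a: str, b: str, base: float = 2.0) -> float:
    """Assumed form of the Baire distance: base ** -(longest common prefix).

    Identical strings get distance 0; strings that differ in the first
    digit get the maximum distance 1.
    """
    if a == b:
        return 0.0
    return base ** -common_prefix_length(a, b)


def baire_hierarchy(digit_strings, depth):
    """Bucket the strings by prefixes of length 1..depth.

    Each level is built in a single pass over the data, so the cost is
    linear in the number of items for a fixed depth. Level k refines
    level k - 1, which gives the hierarchy without pairwise comparisons.
    """
    levels = []
    for k in range(1, depth + 1):
        buckets = defaultdict(list)
        for s in digit_strings:
            buckets[s[:k]].append(s)
        levels.append(dict(buckets))
    return levels


# Illustrative data only, not taken from the survey described above.
data = ["1234", "1299", "1311", "5234"]
levels = baire_hierarchy(data, depth=2)
```

In this sketch, items sharing a longer prefix fall into the same bucket at deeper levels, which is why the nested buckets can be read as a hierarchical clustering.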

    Classification of aligned biological sequences


    Student acceptance of online assessment with e-authentication in the UK

    It has been suggested that the amount of plagiarism and cheating in high-stakes assessment has increased with the introduction of e-assessments (QAA, 2016), which means that authenticating student identity and authorship is increasingly important for online distance higher education. The investigation reported in this paper focuses on the implementation and use in the UK of an adaptive trust-based e-assessment system known as TeSLA (An Adaptive Trust-based e-Assessment System for Learning), currently being developed by an EU-funded project involving 18 partners across 13 countries. TeSLA combines biometric instruments, textual analysis instruments and security instruments. This study, based on Responsible Research and Innovation (RRI), examines the attitudes and experiences of UK students who used the TeSLA instruments. In particular, it considers whether the students found the e-authentication assessment to be a practical, secure and reliable alternative to traditional proctored exams. Data include pre- and post-questionnaires completed by 328 students of The Open University, who engaged with the TeSLA keystroke analysis and anti-plagiarism software. The findings suggest a broadly positive acceptance of these e-authentication technologies. However, based on statistical implicative analysis, there were important differences in the students’ responses between genders, between age groups and between students with different amounts of previous e-assessment experience. For example, men were less concerned about providing personal data than women; middle-aged participants (41 to 50 years old) were more aware of the nuances of cheating and plagiarism; while younger students (up to 30 years old) were more likely to reject e-authentication.